Results 1 - 20 of 590
1.
Oral Oncol ; 152: 106744, 2024 May.
Article in English | MEDLINE | ID: mdl-38520756

ABSTRACT

PURPOSE: In clinical practice the assessment of the "vocal cord-arytenoid unit" (VCAU) mobility is crucial in the staging, prognosis, and choice of treatment of laryngeal squamous cell carcinoma (LSCC). The aim of the present study was to measure repeatability and reliability of clinical assessment of VCAU mobility and radiologic analysis of posterior laryngeal extension. METHODS: In this multi-institutional retrospective study, patients with LSCC-induced impairment of VCAU mobility who received curative treatment were included; pre-treatment endoscopy and contrast-enhanced imaging were collected and evaluated by raters. According to their evaluations, concordance, number of assigned categories, and inter- and intra-rater agreement were calculated. RESULTS: Twenty-two otorhinolaryngologists evaluated 366 videolaryngoscopies (total evaluations: 2170) and 6 radiologists evaluated 237 imaging studies (total evaluations: 477). The concordance of clinical rating was excellent in only 22.7% of cases. Overall, inter- and intra-rater agreement was weak. Supraglottic cancers and transoral endoscopy were associated with the lowest inter-observer reliability values. Radiologic inter-rater agreement was low and did not vary with imaging technique. Intra-rater reliability of radiologic evaluation was optimal. CONCLUSIONS: The current methods to assess VCAU mobility and posterior extension of LSCC are flawed by weak inter-observer agreement and reliability. Radiologic evaluation was characterized by very high intra-rater agreement, but weak inter-observer reliability. The relevance of VCAU mobility assessment in laryngeal oncology should be re-weighted. Patients affected by LSCC requiring imaging should be referred to dedicated radiologists with experience in head and neck oncology.


Subject(s)
Laryngeal Neoplasms , Vocal Cords , Humans , Laryngeal Neoplasms/diagnostic imaging , Laryngeal Neoplasms/pathology , Male , Female , Middle Aged , Aged , Retrospective Studies , Vocal Cords/diagnostic imaging , Vocal Cords/physiopathology , Adult , Reproducibility of Results , Aged, 80 and over , Laryngoscopy/methods , Carcinoma, Squamous Cell/diagnostic imaging , Carcinoma, Squamous Cell/pathology
2.
Laryngoscope ; 134(6): 2835-2843, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38217455

ABSTRACT

BACKGROUND: While videostroboscopy is recognized as the most popular approach for investigating vocal fold function, evaluating the numerical values, such as the membranous glottal gap area, remains too time consuming for clinical applications. METHODS: We used a total of 2507 videostroboscopy images from 137 patients and developed five U-Net-based deep-learning image segmentation models for automatic masking of the membranous glottal gap area. To further validate the models, we used another 410 images from 41 different patients. RESULTS: During development, all five models exhibited acceptable and similar metrics. While the VGG19 U-Net had a long inference time of 1654 ms, the other four models had more practical inference times, ranging from 16 to 138 ms. During further validation, Efficient U-Net demonstrated the highest intersection over union of 0.8455, the highest Dice coefficient of 0.9163, and the lowest Hausdorff distance of 1.5626. The normalized membranous glottal gap area index was also calculated and validated. Efficient U-Net and VGG19 U-Net exhibited the lowest mean squared errors (3.5476 and 3.3842) and the lowest mean absolute errors (1.8835 and 1.8396). CONCLUSIONS: Automatic segmentation of the membranous glottal gap area can be achieved through U-net-based architecture. Considering the segmentation quality and speed, Efficient U-Net is a reasonable choice for this task, while the other four models remain valuable competitors. The models' masked area enables possible calculation of the normalized membranous glottal gap area and analysis of the glottal area waveform, revealing promising clinical applications for this model. LEVEL OF EVIDENCE: NA Laryngoscope, 134:2835-2843, 2024.


Subject(s)
Glottis , Humans , Glottis/diagnostic imaging , Stroboscopy/methods , Deep Learning , Video Recording , Image Processing, Computer-Assisted/methods , Vocal Cords/diagnostic imaging , Vocal Cords/anatomy & histology , Male , Female
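
The overlap metrics reported in this abstract (intersection over union and Dice coefficient) can be illustrated with a minimal sketch. It assumes binary masks stored as nested lists of 0/1; the function name `iou_and_dice` and the toy masks are hypothetical and are not taken from the study. The Hausdorff distance is left out for brevity.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two binary masks of equal shape (lists of 0/1 rows)."""
    inter = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += p & t        # pixel is set in both masks
            pred_area += p
            truth_area += t
    union = pred_area + truth_area - inter
    if union == 0:
        # Both masks empty: treated here as perfect agreement (a convention, not a rule)
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_area + truth_area)


# Toy masks (hypothetical): predicted and reference glottal gap regions
pred_mask = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 1, 0, 0]]
true_mask = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 1, 0]]

iou, dice = iou_and_dice(pred_mask, true_mask)
print(f"IoU = {iou:.4f}, Dice = {dice:.4f}")
```

For a single pair of masks the two scores are linked by Dice = 2 * IoU / (1 + IoU); averages taken over many images need not satisfy this identity exactly.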
3.
Auris Nasus Larynx ; 51(1): 120-124, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37164816

ABSTRACT

OBJECTIVE: Dysphonia is very common worldwide and aerosol drug inhalation is an important treatment for patients with dysphonia. This study aimed to explore the effects of vocal fold (VF) lesions on the particle deposition pattern using computational modeling. METHODS: A realistic mouth-throat (MT) model of a healthy adult was constructed based on computed tomography images. Small and large vocal fold lesions were incorporated in the original model. Steady inhalation flow rates of 15 and 30 liters per minute (LPM) were used as the velocity inlet, and monodisperse particles with diameters of 5 to 10 µm were simulated. RESULTS: Larger particles are more likely to be deposited in MT models, most of them in the oral cavity, oropharynx, and supraglottis. The ideal sizes at 30 LPM ranged over 7-10 µm for healthy VFs and 6-8 µm for VF lesions. The ideal sizes at 15 LPM ranged over 6-8 µm for healthy VFs and 8-9 µm for VF lesions. CONCLUSION: Based on this study, VF lesions clearly influence the deposition pattern in the glottis. The ideal particle sizes differ between the flow rates of 15 and 30 LPM.


Subject(s)
Dysphonia , Vocal Cords , Adult , Humans , Vocal Cords/diagnostic imaging , Pharynx , Respiratory Aerosols and Droplets , Administration, Inhalation , Computer Simulation , Mouth/diagnostic imaging
4.
J Pediatr Surg ; 59(1): 109-116, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37845124

ABSTRACT

PURPOSE: Vocal fold movement impairment (VFMI) secondary to recurrent laryngeal nerve (RLN) injury is a common source of morbidity after pediatric cervical, thoracic, and cardiac procedures. Flexible laryngoscopy (FL) is the gold standard to diagnose VFMI yet can be challenging to perform and/or risks possible clinical decompensation in some children and is an aerosolizing procedure. Laryngeal ultrasound (LUS) is a potential non-invasive alternative, but limited data exists in the pediatric surgical population regarding its efficacy. We aimed to investigate the diagnostic accuracy of LUS compared to FL in evaluating VFMI. METHODS: A prospective, single-center, single-blinded (rater) cohort study was undertaken on perioperative pediatric patients at risk for RLN injury. Patients underwent FL and LUS. Cohen's kappa was used to determine chance-corrected agreement. RESULTS: Between 2021 and 2023, 85 paired evaluations were performed with patients having a median (IQR) age of 10 (4, 42) months and weight of 7.5 (5.4, 13.4) kilograms. The prevalence of VFMI was 27.1%. Absolute agreement between evaluations was 98.8% (kappa 0.97, 95% CI: 0.91-1.00, P < 0.001). The sensitivity and specificity of LUS in detecting VFMI was 95.7% and 100%, yielding a positive predictive value (PPV) of 100% and negative predictive value (NPV) of 98.4% (95% CI: 90-100%). Diagnostic accuracy was 98.8% (95% CI: 93-100%). CONCLUSION: LUS is a highly accurate modality in evaluating VFMI in children. While FL remains the gold standard for diagnosis, LUS offers a low-risk screening modality for children at risk for VFMI such that only those with an abnormal LUS or presence of clinical symptoms discordant with LUS findings should undergo FL. TYPE OF STUDY: Prospective, single-center, single blinded (rater), cohort study. LEVEL OF EVIDENCE: Level II.


Subject(s)
Vocal Cord Paralysis , Vocal Cords , Humans , Child , Infant , Vocal Cords/diagnostic imaging , Vocal Cord Paralysis/diagnostic imaging , Vocal Cord Paralysis/epidemiology , Cohort Studies , Prospective Studies , Ultrasonography
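
The diagnostic accuracy figures in this abstract follow from a 2x2 table of LUS against FL. The abstract does not give the cell counts; the values used below (22 true positives, 1 false negative, 0 false positives, 62 true negatives) are a reconstruction inferred from the reported 85 paired evaluations and percentages, and should be read as an assumption. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Accuracy measures and Cohen's kappa for an index test vs. a reference test."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n  # raw agreement between the two methods
    # Agreement expected by chance, from each method's marginal totals
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    return {
        "prevalence": (tp + fn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Assumed counts, inferred from the reported percentages (not stated in the abstract)
metrics = diagnostic_metrics(tp=22, fn=1, fp=0, tn=62)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the sketch gives values that round to the reported prevalence, sensitivity, specificity, predictive values, accuracy, and kappa, which is the only reason they were chosen.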
5.
Laryngoscope ; 134(4): 1939-1944, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37615373

ABSTRACT

INTRODUCTION: Vocal fold motion impairment (VFMI) is a known consequence of high-risk cardiac surgery. We implemented a universal laryngeal ultrasound (LUS) screening protocol for VFMI after the Norwood procedure and aortic arch surgery. We hypothesized that LUS would accurately identify VFMI and predict postoperative aspiration. METHODS: We implemented a screening algorithm with LUS for patients undergoing high-risk cardiac surgery at a tertiary care pediatric hospital. Positively screened patients underwent flexible nasolaryngoscopy (FNL). Patients with an abnormal FNL underwent a video-fluoroscopic swallow study (VFSS). Patient demographics, length of stay, and swallowing outcomes were assessed. Two-tailed chi-square and Wilcoxon rank sum tests were used to assess for differences. RESULTS: Sixty-seven patients underwent either Norwood or arch reconstruction over a 16-month period and underwent universal LUS. The average birth weight was 3.24 kg (SD 0.57). Of the 67 patients, LUS identified VFMI in 58.21% (n = 39/67), and FNL confirmed all of these cases. Aspiration and penetration on VFSS were more frequent in the group with VFMI than in those without VFMI (53.8% vs. 21.4%, p = 0.008). There was no difference in length of stay between patients without a diagnosis of VFMI and those found to have VFMI (41.0 days vs. 45.3 days, p = 0.73). CONCLUSIONS: Universal LUS screening for patients following high-risk cardiac surgery may lead to earlier identification of postoperative VFMI and aspiration. Recognition of VFMI through this universal screening program could lead to earlier interventions and possibly improved swallowing outcomes. LEVEL OF EVIDENCE: 3 Laryngoscope, 134:1939-1944, 2024.


Subject(s)
Cardiac Surgical Procedures , Vocal Cord Paralysis , Humans , Child , Vocal Cords/diagnostic imaging , Vocal Cords/surgery , Vocal Cord Paralysis/diagnostic imaging , Vocal Cord Paralysis/etiology , Vocal Cord Paralysis/surgery , Cardiac Surgical Procedures/adverse effects , Respiratory Aspiration , Laryngoscopy , Retrospective Studies
6.
Eur Arch Otorhinolaryngol ; 281(4): 2055-2062, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37695363

ABSTRACT

PURPOSE: To develop and validate a deep learning model for distinguishing healthy vocal folds (HVF) and vocal fold polyps (VFP) on laryngoscopy videos, while demonstrating the ability of a previously developed informative frame classifier in facilitating deep learning development. METHODS: Following retrospective extraction of image frames from 52 HVF and 77 unilateral VFP videos, two researchers manually labeled each frame as informative or uninformative. A previously developed informative frame classifier was used to extract informative frames from the same video set. Both sets of videos were independently divided into training (60%), validation (20%), and test (20%) by patient. Machine-labeled frames were independently verified by two researchers to assess the precision of the informative frame classifier. Two models, pre-trained on ResNet18, were trained to classify frames as containing HVF or VFP. The accuracy of the polyp classifier trained on machine-labeled frames was compared to that of the classifier trained on human-labeled frames. The performance was measured by accuracy and area under the receiver operating characteristic curve (AUROC). RESULTS: When evaluated on a hold-out test set, the polyp classifier trained on machine-labeled frames achieved an accuracy of 85% and AUROC of 0.84, whereas the classifier trained on human-labeled frames achieved an accuracy of 69% and AUROC of 0.66. CONCLUSION: An accurate deep learning classifier for vocal fold polyp identification was developed and validated with the assistance of a peer-reviewed informative frame classifier for dataset assembly. The classifier trained on machine-labeled frames demonstrates improved performance compared to the classifier trained on human-labeled frames.


Subject(s)
Deep Learning , Polyps , Humans , Laryngoscopy/methods , Vocal Cords/diagnostic imaging , Neural Networks, Computer , Retrospective Studies , Machine Learning , Polyps/diagnostic imaging
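
The abstract states that videos were divided into training (60%), validation (20%), and test (20%) sets by patient. A minimal sketch of such a patient-level split is shown below; the function name, the frame identifiers, and the fixed seed are hypothetical, and the authors' actual procedure is not described in the abstract.

```python
import random


def split_by_patient(frame_to_patient, fractions=(0.6, 0.2, 0.2), seed=0):
    """Assign whole patients to train/val/test so that no patient spans two sets."""
    patients = sorted(set(frame_to_patient.values()))
    random.Random(seed).shuffle(patients)  # reproducible shuffle of patient IDs
    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),
    }
    # Every frame follows its patient into exactly one split
    return {
        name: [frame for frame, patient in frame_to_patient.items() if patient in members]
        for name, members in groups.items()
    }


# Toy data (hypothetical): 10 patients with 3 frames each
frames = {f"p{p:02d}_f{i}": f"p{p:02d}" for p in range(10) for i in range(3)}
splits = split_by_patient(frames)
print({name: len(items) for name, items in splits.items()})
```

Splitting on patient identifiers rather than on individual frames keeps near-duplicate frames from the same examination out of both the training and the test data.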
7.
Otolaryngol Head Neck Surg ; 170(4): 1099-1108, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38037413

ABSTRACT

OBJECTIVE: Accurate vocal cord leukoplakia classification is instructive for clinical diagnosis and surgical treatment. This article introduces a reliable very deep Siamese network for accurate vocal cord leukoplakia classification. STUDY DESIGN: A study of a classification network based on a retrospective database. SETTING: Academic university and hospital. METHODS: The white light image datasets of vocal cord leukoplakia used in this article were classified into 6 classes: normal tissues, inflammatory keratosis, mild dysplasia, moderate dysplasia, severe dysplasia, and squamous cell carcinoma. The classification performance was assessed by comparing it with 6 classical deep learning models, including AlexNet, VGG Net, Google Inception, ResNet, DenseNet, and Vision Transformer. RESULTS: Experiments show the superior classification performance of our proposed network compared to state-of-the-art methods. The overall accuracy is 0.9756. The values of sensitivity and specificity are very high as well. The confusion matrix provides information for the 6-class classification task and demonstrates the superiority of our proposed network. CONCLUSION: Our very deep Siamese network can provide accurate classification results of vocal cord leukoplakia, which facilitates early detection, clinical diagnosis, and surgical treatment. The excellent performance obtained in white light images can reduce the cost for patients, especially those living in developing countries.


Subject(s)
Laryngeal Diseases , Vocal Cords , Humans , Vocal Cords/diagnostic imaging , Vocal Cords/pathology , Retrospective Studies , Narrow Band Imaging/methods , Laryngeal Diseases/pathology , Endoscopy , Leukoplakia/pathology , Hyperplasia/pathology
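
Overall accuracy, sensitivity, and specificity in a multi-class task such as this one are usually read off a confusion matrix, with each class scored one-vs-rest. The sketch below shows that calculation on a toy 3-class matrix; the matrix values and the function name are hypothetical and do not reproduce the study's 6-class results.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(k)) / total
    results = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(k)) - tp    # predicted c, true class differs
        tn = total - tp - fn - fp
        results.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        })
    return accuracy, results


# Toy confusion matrix (hypothetical values, 3 classes of 10 samples each)
toy_cm = [[8, 1, 1],
          [2, 7, 1],
          [0, 1, 9]]

accuracy, per_class = per_class_metrics(toy_cm)
print(f"overall accuracy = {accuracy:.3f}")
for index, scores in enumerate(per_class):
    print(index, scores)
```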
8.
Article in English | MEDLINE | ID: mdl-38082565

ABSTRACT

Vocal fold motility evaluation is paramount both in the assessment of functional deficits and in the accurate staging of neoplastic disease of the glottis. Diagnostic endoscopy, and in particular videoendoscopy, is nowadays the method through which motility is estimated. The clinical diagnosis, however, relies on the examination of the videoendoscopic frames, which is a subjective and clinician-dependent task. Hence, a more rigorous, objective, reliable, and repeatable method is needed. To support clinicians, this paper proposes a machine learning (ML) approach for vocal cord motility classification. From the endoscopic videos of 186 patients, covering both preserved vocal cord motility and fixation, a dataset of 558 images representing the two classes was extracted. Subsequently, a number of features were extracted from the images and used to train and test four well-established ML classifiers. In testing, the best performance was achieved using XGBoost, with precision = 0.82, recall = 0.82, F1 score = 0.82, and accuracy = 0.82. After comparing the most relevant ML models, we believe that this approach could provide precise and reliable support to clinical evaluation. Clinical Relevance: This research represents an important advancement in the state of the art of computer-assisted otolaryngology, toward an effective tool for motility assessment in clinical practice.


Subject(s)
Endoscopy , Vocal Cords , Humans , Vocal Cords/diagnostic imaging , Glottis , Videotape Recording , Machine Learning
9.
Article in English | MEDLINE | ID: mdl-38083520

ABSTRACT

Laryngeal high-speed video endoscopy is performed to examine the cycles of vocal fold vibration in detail and to diagnose voice abnormalities. One of the recent image processing techniques for visualizing vocal fold vibration is optical flow-based playbacks, which include optical flow kymograms (OFKG) for local dynamics, and optical flow glottovibrograms (OFGVG) and glottal optical flow waveforms (GOFW) for global dynamics. In recent times, various optical flow computing algorithms have been developed. In this paper, we used four well-known optical flow algorithms (Horn-Schunck, Lucas-Kanade, Gunnar Farneback, and TV-L1) to construct the optical flow playbacks. The reliability of the proposed playbacks is examined by comparing them to traditional representations such as the Phonovibrogram (PVG). Since PVG and OFGVG are interconnected, a comparison study was carried out to better comprehend their interaction. Clinical Relevance: Both OFGVG and PVG add to the precision of interpreting pathological conditions by offering complementary information to the conventional spatiotemporal representations.


Subject(s)
Optic Flow , Vocal Cords , Vocal Cords/diagnostic imaging , Vocal Cords/pathology , Reproducibility of Results , Endoscopy , Glottis
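
Of the four optical flow algorithms named in this abstract, Lucas-Kanade is the simplest to sketch. The example below is a one-dimensional, single-window least-squares version written purely for illustration; it is not the pipeline used in the paper, and the synthetic intensity profile and function name are hypothetical.

```python
import math


def lucas_kanade_1d(frame0, frame1):
    """Estimate one sub-pixel shift between two 1-D intensity profiles.

    Brightness constancy gives Ix * d + It = 0 at each pixel; the
    least-squares solution over the window is d = -sum(Ix*It) / sum(Ix^2).
    """
    numerator = denominator = 0.0
    for x in range(1, len(frame0) - 1):
        ix = (frame0[x + 1] - frame0[x - 1]) / 2.0  # spatial gradient (central difference)
        it = frame1[x] - frame0[x]                  # temporal difference between frames
        numerator += ix * it
        denominator += ix * ix
    return -numerator / denominator


# Synthetic profile (hypothetical): a smooth pattern displaced by half a pixel
true_shift = 0.5
frame0 = [math.sin(0.1 * x) for x in range(200)]
frame1 = [math.sin(0.1 * (x - true_shift)) for x in range(200)]

estimated_shift = lucas_kanade_1d(frame0, frame1)
print(f"estimated shift = {estimated_shift:.3f} px")
```

The linearization behind this estimate only holds for small displacements of a smooth signal, which is why practical implementations typically work on image pyramids and local windows.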
10.
J Acoust Soc Am ; 154(6): 3595-3603, 2023 Dec 1.
Article in English | MEDLINE | ID: mdl-38038612

ABSTRACT

The messa di voce (MdV), which consists of a continuous crescendo and subsequent decrescendo on one pitch is one of the more difficult exercises of the technical repertoire of Western classical singing. With rising lung pressure, regulatory adjustments both on the level of the glottis and the vocal tract are required to keep the pitch stable. The dynamic changes of vocal tract dimensions with the bidirectional variation of sound pressure level (SPL) during MdV were analyzed by two-dimensional real-time magnetic resonance imaging (25 frames/s) and synchronous audio recordings in 12 professional singer subjects. Close associations in the respective articulatory kinetics were found between SPL and lip opening, jaw opening, pharynx width, uvula elevation, and vertical larynx position. However, changes in vocal tract dimensions during plateaus of SPL suggest that perceived loudness could have been varied beyond the dimension of SPL. Further multimodal investigation, including the analysis of sound spectra, is needed for a better understanding of the role of vocal tract resonances in the control of vocal loudness in human phonation.


Subject(s)
Larynx , Singing , Voice , Humans , Phonation , Larynx/diagnostic imaging , Sound , Vocal Cords/diagnostic imaging
11.
Medicine (Baltimore) ; 102(51): e36761, 2023 Dec 22.
Article in English | MEDLINE | ID: mdl-38134083

ABSTRACT

Airway procedures in life-threatening situations are vital for saving lives. Video laryngoscopy (VL) is commonly performed during endotracheal intubation (ETI) in the emergency department. Artificial intelligence (AI) is widely used in the medical field, particularly to detect anatomical structures. This study aimed to develop an AI algorithm that detects vocal cords from VL images acquired during emergent situations. This retrospective study used VL images acquired in the emergency department to facilitate the ETI. The vocal cord image was labeled with a ground-truth bounding box. The dataset was divided into training and validation datasets. The algorithm was developed from a training dataset using the YOLOv4 model. The performance of the algorithm was evaluated using a test set. The test set was further divided into specific environments during the ETI for clinical subgroup analysis. In total, 20,161 images from 84 patients were used in this study. A total of 10,287, 5766, and 4108 images were used for the model training, validation, and test sets, respectively. The developed algorithm achieved F1 score 0.906, sensitivity 0.963, and specificity 0.842 in the validation set. The performance in the test set was F1 score 0.808, sensitivity 0.823, and specificity 0.804. We developed and validated an AI algorithm to detect vocal cords in VL. This algorithm demonstrated a high performance. The algorithm can be used to determine the vocal cord to ensure safe ETI.


Subject(s)
Artificial Intelligence , Vocal Cords , Humans , Vocal Cords/diagnostic imaging , Laryngoscopy/methods , Retrospective Studies , Algorithms , Intubation, Intratracheal/adverse effects , Intubation, Intratracheal/methods
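
Bounding-box detectors such as the one described here are commonly scored by matching each predicted box to the ground-truth box through their intersection over union, then summarizing the matches with an F1 score. The sketch below illustrates both steps; the 0.5 matching threshold, the box coordinates, and the counts are assumptions for illustration, since the abstract does not state the matching rule.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical boxes in pixel coordinates
truth_box = (100, 100, 200, 200)
pred_box = (150, 100, 250, 200)

overlap = box_iou(truth_box, pred_box)
is_match = overlap >= 0.5  # assumed threshold, not taken from the abstract
print(f"IoU = {overlap:.3f}, counted as detection: {is_match}")
print(f"F1 for 8 TP, 2 FP, 2 FN = {f1_score(8, 2, 2):.2f}")
```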
12.
Codas ; 35(6): e20220173, 2023.
Article in Portuguese, English | MEDLINE | ID: mdl-37909493

ABSTRACT

PURPOSE: To compare the frequency of vocal fold opening variation, analyzed by digital kymography, with the fundamental frequency of the voice obtained by acoustic analysis, in individuals without laryngeal alteration. METHODS: Observational analytical cross-sectional study. The participants were 48 women and 38 men from 18 to 55 years of age. The evaluation comprised acoustic analysis of the voice, obtained from habitual emission of the vowel /a/ for 3 seconds and of the days of the week, and digital kymography (DKG), obtained from habitual emission of the vowels /i/ and /ɛ/. The measurements analyzed were the acoustic fundamental frequency (f0), extracted by the Computerized Speech Lab (CSL) program, and the dominant frequency of the variation of right (R-freq) and left (L-freq) vocal fold opening, obtained through the KIPS image processing program. Assembly of the kymograms consisted of manual demarcation of the region, using vertical lines delimiting the width of the vocal fold and horizontal lines separating the posterior, middle, and anterior thirds of the rima glottidis. In the statistical analysis, the Anderson-Darling test was used to verify the normality of the sample. The ANOVA and Tukey tests were performed for the comparison of measurements between the groups. For the comparison of age between the groups, the Mann-Whitney test was used. RESULTS: There were no differences between the frequency values measured by digital kymography and the acoustic fundamental frequency in individuals without laryngeal alteration. CONCLUSION: The values of the dominant frequency of vocal fold opening variation, as assessed by digital kymography, and the acoustic fundamental frequency of the voice are similar, allowing comparison between these measurements in the multidimensional evaluation of the voice in individuals without laryngeal alteration.


Subject(s)
Phonation , Vocal Cords , Female , Humans , Male , Acoustics , Cross-Sectional Studies , Kymography/methods , Vibration , Vocal Cords/diagnostic imaging , Young Adult , Adult , Middle Aged
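
The "dominant frequency" of a vocal fold opening signal can be understood as the frequency bin with the largest spectral power. The sketch below finds it with a plain discrete Fourier transform; the 4000 frames/s rate, the 200 Hz synthetic signal, and the function name are hypothetical, and the method actually used by the KIPS program is not described in the abstract.

```python
import cmath
import math


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC component of a signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the constant offset
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                    for i, c in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n


# Synthetic opening signal (hypothetical): 200 Hz oscillation sampled at 4000 frames/s
frame_rate = 4000
opening = [1 + math.sin(2 * math.pi * 200 * i / frame_rate) for i in range(400)]

f_dominant = dominant_frequency(opening, frame_rate)
print(f"dominant frequency = {f_dominant:.1f} Hz")
```

The frequency resolution of this estimate is the sample rate divided by the number of samples (10 Hz here), so a longer recording is needed to compare it closely with an acoustic f0 value.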
13.
PLoS One ; 18(10): e0293659, 2023.
Article in English | MEDLINE | ID: mdl-37903145

ABSTRACT

The oblique orientation of the vocal cords demands strict compliance, by technicians and clinicians, with the recommended CT scan of the larynx in a plane parallel to the vocal cords. The repercussions of non-compliance have not previously been investigated. We aimed to observe the influence of a non-parallel vocal cord plane CT scan on qualitative and quantitative glottic parameters, using the parallel-plane CT as the standard for comparison. Potential suboptimal imaging sequelae resulting from the unreformatted CT plane were also identified. In this study we included 95 normal adult glottides and retrospectively analyzed their anatomy in two axial planes: the non-parallel plane ① and the plane parallel to the vocal cords ②. Qualitative (shape, structures at glottic level) and quantitative (anterior commissure ACom, vocal cord width VCw, anteroposterior AP, transverse Tr, cross-sectional area CSA) glottic variables were recorded. Multivariate statistical analysis was used to predict patterns and their impact on glottic anatomy. Plane ① displayed supraglottic features at the level of the glottis: adipose tissue (90.5%) and split thyroid laminae (70.6%). Other categorical variables (atypical shape, submental structures, and multilevel vertebral crossing) were also present in the majority. All glottic dimensions varied significantly between the two planes, most of all ACom (-5.8 mm) and CSA (-15.0 mm²). In contrast, plane ② manifested higher VCw (>73%), Tr (66.3%), CSA (64.2%), and AP (44.2%) measurements. On correlation analysis, variation in ACom, CSA, and Tr was positively associated with VC or plane obliquity (p<0.05). This variability was greater in obese and short-necked subjects. A change in one parameter also significantly modified others, i.e., ACom versus AP and CSA versus Tr. The results indicated statistically significant changes in subjective and objective anatomical parameters of the glottis when the appropriate CT larynx protocol was not applied for image analysis, highlighting the importance of image reformatting.


Subject(s)
Laryngeal Neoplasms , Larynx , Adult , Humans , Vocal Cords/diagnostic imaging , Vocal Cords/anatomy & histology , Retrospective Studies , Glottis/diagnostic imaging , Glottis/anatomy & histology , Larynx/diagnostic imaging , Tomography, X-Ray Computed
14.
Head Neck ; 45(12): 3129-3145, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37837264

ABSTRACT

BACKGROUND: Accurate vocal cord leukoplakia classification is critical for the individualized treatment and early detection of laryngeal cancer. Numerous deep learning techniques have been proposed, but it is unclear how to select one to apply to laryngeal tasks. This article introduces and reliably evaluates existing deep learning models for vocal cord leukoplakia classification. METHODS: We created white light and narrow band imaging (NBI) image datasets of vocal cord leukoplakia, which were classified into six classes: normal tissues (NT), inflammatory keratosis (IK), mild dysplasia (MiD), moderate dysplasia (MoD), severe dysplasia (SD), and squamous cell carcinoma (SCC). Vocal cord leukoplakia classification was performed using six classical deep learning models: AlexNet, VGG, Google Inception, ResNet, DenseNet, and Vision Transformer. RESULTS: GoogLeNet (i.e., Google Inception V1), DenseNet-121, and ResNet-152 achieved excellent classification performance. The highest overall accuracy of white light image classification is 0.9583, while the highest overall accuracy of NBI image classification is 0.9478. These three neural networks all provide very high sensitivity, specificity, and precision values. CONCLUSION: GoogLeNet, ResNet, and DenseNet can provide accurate pathological classification of vocal cord leukoplakia. This facilitates early diagnosis, supports the choice between conservative and surgical treatment for different degrees of disease, and reduces the burden on endoscopists.


Subject(s)
Deep Learning , Laryngeal Neoplasms , Humans , Vocal Cords/diagnostic imaging , Vocal Cords/pathology , Narrow Band Imaging/methods , Endoscopy , Laryngeal Neoplasms/pathology , Endoscopy, Gastrointestinal , Leukoplakia/diagnostic imaging , Leukoplakia/pathology , Hyperplasia/pathology
15.
Sci Data ; 10(1): 733, 2023 Oct 21.
Article in English | MEDLINE | ID: mdl-37865668

ABSTRACT

The endoscopic examination of subepithelial vascular patterns within the vocal fold is crucial for clinicians seeking to distinguish between benign lesions and laryngeal cancer. Among innovative techniques, Contact Endoscopy combined with Narrow Band Imaging (CE-NBI) offers real-time visualization of these vascular structures. Despite the advent of CE-NBI, concerns have arisen regarding the subjective interpretation of its images. As a result, several computer-based solutions have been developed to address this issue. This study introduces the CE-NBI data set, the first publicly accessible data set that features enhanced and magnified visualizations of subepithelial blood vessels within the vocal fold. This data set encompasses 11144 images from 210 adult patients with pathological vocal fold conditions, where CE-NBI images are annotated using three distinct label categories. The data set has proven invaluable for numerous clinical assessments geared toward diagnosing laryngeal cancer using Optical Biopsy. Furthermore, given its versatility for various image analysis tasks, we have devised and implemented diverse image classification scenarios using Machine Learning (ML) approaches to address critical clinical challenges in assessing laryngeal lesions.


Subject(s)
Laryngeal Neoplasms , Laryngoscopy , Larynx , Adult , Humans , Laryngeal Neoplasms/diagnostic imaging , Laryngeal Neoplasms/pathology , Larynx/diagnostic imaging , Narrow Band Imaging , Vocal Cords/diagnostic imaging
16.
Vestn Otorinolaringol ; 88(4): 25-39, 2023.
Article in Russian | MEDLINE | ID: mdl-37767588

ABSTRACT

Fiberoptic laryngoscopy is a standard procedure for the evaluation of vocal fold immobility. However, this method is invasive and requires special qualifications and technical equipment, which limits its routine use. Therefore, in daily practice, the vast majority of laryngoscopies are performed indirectly, and their accuracy depends on the experience of the specialist and the compliance of the patient. On the other hand, a large number of patients require a convenient, non-invasive, and inexpensive approach to assess vocal fold mobility. Transcutaneous laryngeal ultrasonography can be such a method. However, the disadvantage of this technique is its low informative value. OBJECTIVE: To increase the effectiveness of the diagnosis of laryngeal dysfunction using transcutaneous laryngeal ultrasonography. MATERIAL AND METHODS: Patients underwent laryngeal ultrasonography and videolaryngoscopy before and after thyroid or parathyroid surgery. Ultrasound was performed polypositionally in the transverse and oblique planes. Functional tests with breathing and breath holding were used. Qualitative (the smile or flying bird signs, the vertical closing line of the vocal folds, synchronous and symmetrical movement of the arytenoid cartilages) and quantitative (the length contraction of the vocal cord, the rotation angle of the arytenoid cartilage) ultrasonic parameters determine normal vocal fold mobility. RESULTS: 996 patients were included in the study. Vocal fold paresis was detected in 106 (10.6%) patients. In 72 (7.2%) cases, partially impaired mobility of the vocal folds (laryngeal dyskinesia) was detected. The echographic patterns of these patients were analyzed. Qualitative ultrasound signs of laryngeal dysfunction were identified: crooked smile or falling bird signs, deformation of the closing line of the vocal folds, and arytenoid immobility. Quantitative ultrasound signs included a decrease in the length contraction of the vocal cord and a reduction of the rotation angle of the arytenoid cartilage. Unilateral laryngeal paresis was diagnosed in 101 (10.1%) patients. In unilateral disorders, the rotation angle of the arytenoid on the affected side was 0-14° and the length contraction of the vocal cord was 0-1.8 mm. Crooked smile or falling bird signs, deformation of the closing line of the vocal folds, and immobility of the arytenoid cartilages were also observed. In 5 (0.5%) cases bilateral laryngeal paresis was revealed, in which the rotation angles of the arytenoid were 0-14° on both sides and the length contraction of the vocal cords was 0-1.8 mm. In these cases there were no smile or flying bird signs and no closing line of the vocal folds. Laryngeal dyskinesia was characterized by crooked smile or falling bird signs and deformation of the closing line of the vocal folds. At the same time, partial mobility of the arytenoid cartilage was noted in comparison with the contralateral side (the rotation angle of the arytenoid differed between the right and left sides by 15° or more). CONCLUSION: The sensitivity and specificity of polypositional ultrasound of the vocal folds were 100% and 99.8% in women and 85.7% and 99.2% in men, respectively.


Subject(s)
Dyskinesias , Larynx , Vocal Cord Paralysis , Male , Humans , Female , Vocal Cords/diagnostic imaging , Larynx/diagnostic imaging , Vocal Cord Paralysis/diagnostic imaging , Vocal Cord Paralysis/etiology , Ultrasonography
17.
J Biomed Opt ; 28(8): 087002, 2023 08.
Article in English | MEDLINE | ID: mdl-37560326

ABSTRACT

Significance: The vocal folds are critically important structures within the larynx which serve the essential functions of supporting the airway, preventing aspiration, and enabling phonation. The vocal fold mucosa has a unique multilayered architecture whose layers have discrete viscoelastic properties facilitating sound production. Perturbations in these properties lead to voice loss. Currently, vocal fold pliability is inferred clinically using laryngeal videostroboscopy and no tools are available for in vivo objective assessment. Aim: The main objective of the present study is to evaluate the viability of Brillouin microspectroscopy for differentiating the mechanical properties of the vocal folds from those of surrounding tissues. Approach: We used Brillouin microspectroscopy as an emerging optical imaging modality capable of providing information about local viscoelastic properties of tissues in a noninvasive and remote manner. Results: Brillouin measurements of porcine vocal folds were performed. Elasticity-driven Brillouin spectral shifts were recorded and analyzed. Elastic properties, as assessed by Brillouin spectroscopy, strongly correlate with those acquired using classical elasticity measurements. Conclusions: These results demonstrate the feasibility of Brillouin spectroscopy for vocal fold imaging. With more extensive research, this technique may provide noninvasive objective assessment of vocal fold mucosal pliability toward objective diagnoses and more targeted treatments.


Subject(s)
Larynx , Vocal Cords , Animals , Swine , Vocal Cords/diagnostic imaging , Phonation , Elasticity , Spectrum Analysis
18.
World J Surg ; 47(11): 2792-2799, 2023 11.
Article in English | MEDLINE | ID: mdl-37540267

ABSTRACT

BACKGROUND: Vocal cord paresis (VCP) is a serious complication after esophagectomy. Conventional diagnosis of VCP relies on flexible laryngoscopy (FL), which is invasive. Laryngeal ultrasonography (LUSG) is non-invasive and convenient. It has provided accurate VC evaluation after thyroidectomy, but it is unclear whether it is as accurate following esophagectomy. This prospective study evaluated the feasibility and accuracy of LUSG in VC assessment on Day-1 after esophagectomy. METHODS: Consecutive patients from a tertiary teaching hospital who underwent elective esophagectomy were prospectively recruited. All received pre-operative FL, and post-operative LUSG and FL on Day-1, each performed by a blinded, independent assessor. The primary outcomes were feasibility and accuracy of LUSG in the diagnosis of VCP on Day-1 post-esophagectomy. The accuracy of voice assessment (VA) was also analyzed. RESULTS: Twenty-six patients were eligible for analysis. The median age was 70 years (66-73). The majority were male (84.6%). Twenty-five (96.2%) received three-phase esophagectomy. Twenty-four (96%) had same-stage anastomosis at the neck. Three (11.5%) developed temporary and one (3.8%) developed permanent unilateral VCP. The overall VC visualization rate by LUSG was 100%; the sensitivity, specificity, positive predictive value, negative predictive value (NPV) and accuracy of LUSG were 75.0%, 100%, 100%, 98.0% and 98.1%, respectively, and were superior to those of VA. Combining LUSG with VA findings detected all VCPs, i.e., it improved sensitivity and NPV to 100%. CONCLUSION: LUSG is a highly feasible, accurate and non-invasive method to evaluate VC function early after esophagectomy. Post-operative FL may be avoided in patients with both normal LUSG and voice.
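The accuracy figures in this abstract follow the standard 2x2 diagnostic-table definitions. The sketch below is illustrative only: the abstract does not report the underlying counts, so the values used here (3 true positives, 1 false negative, 0 false positives, 48 true negatives) are an assumption. They are what a per-cord analysis of 52 vocal cords in 26 patients would give, and they reproduce the reported percentages. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return standard diagnostic accuracy measures from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased cords correctly flagged
        "specificity": tn / (tn + fp),   # healthy cords correctly cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# ASSUMED counts (not given in the abstract): per-cord analysis, 52 cords.
metrics = diagnostic_metrics(tp=3, fn=1, fp=0, tn=48)

for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

Under these assumed counts, the single missed paresis accounts for both the 75.0% sensitivity and the NPV falling just short of 100%.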


Subject(s)
Vocal Cord Paralysis , Vocal Cords , Humans , Male , Female , Aged , Vocal Cords/diagnostic imaging , Prospective Studies , Esophagectomy/adverse effects , Feasibility Studies , Vocal Cord Paralysis/diagnostic imaging , Vocal Cord Paralysis/etiology , Laryngoscopy , Ultrasonography , Thyroidectomy/adverse effects
19.
J Speech Lang Hear Res ; 66(9): 3276-3289, 2023 09 13.
Article in English | MEDLINE | ID: mdl-37652062

ABSTRACT

OBJECTIVE: An experiment with controllable boundaries was designed to assess the influence of the recording angle and distance on two-dimensional (2D) imaging in laryngoscopy and on the resulting 2D parameters calculated from the glottal area waveform (GAW). METHOD: Two high-speed camera setups were used to synchronously record an oscillating synthetic vocal fold (VF) model, simulating high-speed videoendoscopy. One camera recorded at variable lateral recording angles, while a reference camera recorded from a superior perspective. This was performed at different physiological recording distances and for two oscillation modes (with/without contacting VFs). The GAW was derived from the segmented glottis, and two parameters each for the categories of symmetry, periodicity, and closure were calculated, as well as two derivative measures. The percentage difference between the variable and reference camera value pairs was calculated, and the angle and height dependencies were quantified using linear regression. RESULTS: The visual perception of a laryngoscopy was found to be influenced by the lateral recording angle, which may lead to misinterpretation of VF symmetry among inexperienced observers. The strongest influence of recording angle was observed for symmetry parameters, the strongest being the Amplitude Symmetry Index with up to 2.6%/° (p < .05). A dependence on the recording distance was only found for the Maximum Area Declination Rate. CONCLUSIONS: The recording angle in 2D laryngoscopy should be carefully considered during visual inspection of the VF dynamics. Most of the investigated objective parameters were unaffected by the examined perspective distortion. However, left-right symmetry measures in particular should only be used under controlled boundary conditions to avoid misdiagnosis and misinterpretation. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.23961183.
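The analysis described in this abstract (percentage difference between paired camera values, then a linear regression against recording angle) can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not give the exact percentage-difference formula, so a difference relative to the reference camera is assumed, and the angle and parameter values are synthetic placeholders rather than study data. The names `percent_difference` and `ols_slope` are hypothetical.

```python
def percent_difference(value, reference):
    """Percentage difference of a value relative to the reference camera value.

    ASSUMPTION: the abstract does not state the formula; this is one common choice.
    """
    return 100.0 * (value - reference) / reference


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# SYNTHETIC illustration values, not measurements from the study.
angles_deg = [0.0, 5.0, 10.0, 15.0]       # lateral recording angles
reference_value = 1.0                     # parameter seen by the reference camera
variable_values = [1.0, 1.1, 1.2, 1.3]    # same parameter seen by the angled camera

differences = [percent_difference(v, reference_value) for v in variable_values]
slope = ols_slope(angles_deg, differences)
print(f"angle dependence: {slope:.2f} %/deg")
```

The slope has units of percent per degree, the same form as the "up to 2.6%/°" reported for the Amplitude Symmetry Index.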


Subject(s)
Glottis , Laryngoscopy , Humans , Glottis/diagnostic imaging , Vocal Cords/diagnostic imaging , Linear Models , Reference Values
20.
Ann Biomed Eng ; 51(10): 2182-2191, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37261591

ABSTRACT

Type I thyroplasty is widely used to improve voice production in patients affected by unilateral vocal fold paralysis. Almost two-thirds of laryngologists report using Silastic® implants to medialize the vocal fold, with implant size, shape, and location determined experientially. However, post-surgical complications arising from this procedure (extrusion, migration, resizing) necessitate revision in 4.5-16% of patients. To improve initial surgical outcomes, we have developed a subject-specific modeling tool, PhonoSim, which uses model reconstruction from MRI scans to predict the optimal implantation location. Eleven vocal fold sample sides from eight larynges of New Zealand white rabbits were randomized to two groups: PhonoSim informed (n = 6), and control (no model guidance, n = 5). Larynges were scanned ex vivo in the abducted configuration using a vertical-bore 11.7 T microimaging system, and images were used for subject-specific modeling. The PhonoSim tool simulated vocal fold adduction for multiple implant placements to evaluate vocal fold adduction at the medial surface. The best implant placement coordinates were output for the 6 samples in the PhonoSim group. Control placements were determined by the same surgeon based on anatomical landmarks. Post-surgical MRI scans were performed for all samples to evaluate medialization in implanted vocal folds. Results show that PhonoSim-guided implantation achieved higher vocal fold medialization relative to controls (glottal area reduction of 28 to 55% vs. -29 to 39%, respectively), suggesting that this tool has the potential to improve outcomes and revision rates for type I thyroplasty.


Subject(s)
Laryngoplasty , Vocal Cord Paralysis , Animals , Rabbits , Laryngoplasty/adverse effects , Laryngoplasty/methods , Prostheses and Implants/adverse effects , Prosthesis Implantation/adverse effects , Prosthesis Implantation/methods , Vocal Cord Paralysis/diagnostic imaging , Vocal Cord Paralysis/surgery , Vocal Cord Paralysis/etiology , Vocal Cords/diagnostic imaging , Vocal Cords/surgery